Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor media type sniffing #377

Merged
merged 24 commits into from
Aug 22, 2023
Merged

Refactor media type sniffing #377

merged 24 commits into from
Aug 22, 2023

Conversation

mickael-menu
Copy link
Member

@mickael-menu mickael-menu commented Aug 20, 2023

  • MediaType is now a simple class like Url and Href with no additional information about the format (file extension, name).
    • Rationale: It was too easy to build a non-canonical media type with missing information, just from parsing a string representation.
    • It is still useful to compare media types and extract the components from the string representation.
  • A new MediaTypeHints bundles fileExtensions and mediaTypes hints together, to avoid the explosion of APIs for all the various hints combinations, and also to make it clear when a mediaTypes property is actually for hints.
  • A FormatRegistry is used to register known media type file extensions.
    • By default, only actual publication and license formats are registered, because we don't need to declare file extensions and names for every single media types (JPEG, MP3, etc.).
  • MediaTypeRetriever is simplified to delegate the asset opening to the AssetRetriever.
    • Now, many components (HTTP client, resource, container, publication factory, etc.) depend on a MediaTypeRetriever for fine-tuned sniffing and then provide a canonical mediaType as output. Apps are encouraged to use this instead of relying on a central MediaTypeRetriever supporting all cases.
      • Rationale: These components know better what kind of hints are available and what kind of assets they are manipulating (container, resource, etc.) + no more dependency on Resource / Container from the media type "module".

@mickael-menu mickael-menu requested a review from qnga August 20, 2023 10:50
@@ -118,7 +119,7 @@ class ReaderRepository(
val readerInitData = when {
publication.conformsTo(Publication.Profile.AUDIOBOOK) ->
openAudio(bookId, publication, initialLocator)
publication.conformsTo(Publication.Profile.EPUB) ->
publication.conformsTo(Publication.Profile.EPUB) || publication.readingOrder.allAreHtml ->
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To support Readium Web Publication with HTML resources such as https://readium.org/webpub-manifest/examples/MobyDick/manifest.json

?.let { File(it).extension }
?: "tmp"
val file = File(dir, "$filename.$extension")
val file = File(dir, "$filename.${extension(context)}")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extension was missing from the URL in the case of a content scheme. This made the format sniffing more expensive.

@mickael-menu mickael-menu marked this pull request as ready for review August 21, 2023 09:24
@mickael-menu mickael-menu requested a review from qnga August 22, 2023 11:50
@qnga qnga merged commit b95102c into remove-fetcher Aug 22, 2023
3 checks passed
@qnga qnga deleted the remove-fetcher-mediatype branch August 22, 2023 12:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants